
Conversation

@lutter (Collaborator) commented Jan 16, 2026

Replaces std::sync::RwLock with parking_lot::RwLock for pool metrics

Use parking_lot::RwLock instead of std::sync::RwLock for connection pool metric recording. parking_lot::RwLock is faster for short-held locks as it uses efficient spinning before parking, reducing tokio worker thread blocking during metric recording.

This change helps reduce tokio threadpool contention when the connection pool is under heavy load, as the metric recording locks are held for only microseconds.
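For context, a minimal sketch of what the swap looks like at a call site, assuming a metrics struct along these lines (`PoolMetrics` and `wait_stats` are illustrative names, not the PR's actual code):

```rust
// Sketch only: the struct and field are hypothetical stand-ins.
use std::time::Duration;

use parking_lot::RwLock;

struct PoolMetrics {
    wait_stats: RwLock<Vec<Duration>>,
}

impl PoolMetrics {
    /// Record one checkout wait. The lock is held only for the push, so
    /// parking_lot's brief spin usually resolves contention without ever
    /// parking the tokio worker thread.
    fn record_wait(&self, wait: Duration) {
        self.wait_stats.write().push(wait);
        // The std::sync::RwLock equivalent needs an unwrap for poisoning:
        //   self.wait_stats.write().unwrap().push(wait);
        // parking_lot has no poisoning, which also simplifies call sites.
    }
}
```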

lutter and others added 8 commits January 16, 2026 12:33
…pool metrics

Use parking_lot::RwLock instead of std::sync::RwLock for connection pool
metric recording. parking_lot::RwLock is faster for short-held locks as
it uses efficient spinning before parking, reducing tokio worker thread
blocking during metric recording.

This change helps reduce tokio threadpool contention when the connection
pool is under heavy load, as the metric recording locks are held for
only microseconds.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
These locks are accessed on every GraphQL query, so using the faster
parking_lot::RwLock reduces lock contention in the query path.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…ents

Replace std::sync::RwLock with parking_lot::RwLock in the
SubscriptionManager to reduce lock contention. parking_lot's RwLock
is faster for short-held locks due to efficient spinning before
parking, which helps reduce tokio threadpool contention.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace std::sync::RwLock with parking_lot::RwLock in the background
writer's Request::Write batch handling. This reduces lock contention
as parking_lot's RwLock is faster for short-held locks due to
efficient spinning before parking.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace std::sync::RwLock with parking_lot::RwLock in TimedCache for
faster lock acquisition on cache gets and sets. parking_lot's RwLock
uses efficient spinning before parking, reducing contention.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace std::sync::RwLock with parking_lot::RwLock for the chain
stores map in BlockStore. This reduces lock contention when looking
up or modifying chain stores.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…egistry

Replace std::sync::RwLock with parking_lot::RwLock for the global
metrics caches in MetricsRegistry. This reduces lock contention when
registering or looking up global metrics.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…eepAlive

Replace std::sync::RwLock with parking_lot::RwLock for the alive_map
in SubgraphKeepAlive. This reduces lock contention when tracking
running subgraph deployments.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@lutter force-pushed the lutter/block branch 2 times, most recently from ab17afd to a6245db on January 16, 2026 at 22:38
With many subgraphs, chain_head_ptr() was querying the database on every
call, leading to connection pool saturation. This adds an adaptive cache
that learns optimal TTL from observed block frequency.

The cache uses EWMA to estimate block time and sets TTL to 1/4 of that
estimate (bounded by 20ms-2000ms). During warmup (first 5 blocks), it
uses the minimum TTL to avoid missing blocks on unknown chains.

New metrics:
- chain_head_ptr_cache_hits: cache hit counter
- chain_head_ptr_cache_misses: cache miss counter (DB queries)
- chain_head_ptr_cache_block_time_ms: estimated block time per chain

Safety escape hatch: set GRAPH_STORE_DISABLE_CHAIN_HEAD_PTR_CACHE=true
to revert to the previous uncached behavior.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
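A minimal sketch of the adaptive TTL logic described above: the 20ms-2000ms bounds, the 5-block warmup, and TTL = block time / 4 are from the commit message, while the names and the EWMA smoothing factor are illustrative assumptions:

```rust
use std::time::Duration;

const MIN_TTL: Duration = Duration::from_millis(20);
const MAX_TTL: Duration = Duration::from_millis(2000);
const WARMUP_BLOCKS: u32 = 5;
const EWMA_ALPHA: f64 = 0.2; // assumed smoothing factor, not from the PR

struct AdaptiveTtl {
    blocks_seen: u32,
    block_time_ms: f64, // EWMA estimate of inter-block time
}

impl AdaptiveTtl {
    fn new() -> Self {
        Self { blocks_seen: 0, block_time_ms: 0.0 }
    }

    /// Fold one observed inter-block interval into the EWMA estimate.
    fn observe(&mut self, interval: Duration) {
        let ms = interval.as_millis() as f64;
        self.block_time_ms = if self.blocks_seen == 0 {
            ms
        } else {
            EWMA_ALPHA * ms + (1.0 - EWMA_ALPHA) * self.block_time_ms
        };
        self.blocks_seen += 1;
    }

    /// TTL for the cached chain head pointer: the minimum during warmup
    /// (so unknown chains don't miss blocks), then a quarter of the
    /// estimated block time, clamped to the configured bounds.
    fn ttl(&self) -> Duration {
        if self.blocks_seen < WARMUP_BLOCKS {
            return MIN_TTL;
        }
        let ttl = Duration::from_millis((self.block_time_ms / 4.0) as u64);
        ttl.clamp(MIN_TTL, MAX_TTL)
    }
}
```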
lutter and others added 2 commits January 17, 2026 15:47
Replace RwLock<MovingStats> with a lock-free AtomicMovingStats that uses
an atomic ring buffer with packed bins. Each bin packs epoch (32 bits),
count (32 bits), and duration_nanos (64 bits) into a single AtomicU128
for lock-free CAS updates.

This eliminates lock contention when many threads write concurrently
(every semaphore wait, connection checkout, query execution) while
halving memory usage (4.8KB vs. 9.6KB per stats instance).

Co-Authored-By: Claude Opus 4.5 <[email protected]>
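A minimal sketch of one packed bin and its CAS update, assuming the portable-atomic crate for the 128-bit atomic (std has no stable AtomicU128); the bin layout matches the commit message, everything else is illustrative:

```rust
use portable_atomic::{AtomicU128, Ordering};

/// One ring-buffer bin: epoch (high 32 bits) | count (32 bits) |
/// duration_nanos (low 64 bits), all updated in a single CAS.
struct Bin(AtomicU128);

impl Bin {
    fn new() -> Self {
        Bin(AtomicU128::new(0))
    }

    fn pack(epoch: u32, count: u32, nanos: u64) -> u128 {
        ((epoch as u128) << 96) | ((count as u128) << 64) | nanos as u128
    }

    fn unpack(v: u128) -> (u32, u32, u64) {
        ((v >> 96) as u32, (v >> 64) as u32, v as u64)
    }

    /// Record one sample in this bin for `epoch`, resetting the bin first
    /// if it still holds data from an older epoch. Lock-free: concurrent
    /// writers simply retry on CAS failure instead of blocking.
    fn record(&self, epoch: u32, nanos: u64) {
        let mut cur = self.0.load(Ordering::Relaxed);
        loop {
            let (e, count, total) = Self::unpack(cur);
            let next = if e == epoch {
                Self::pack(epoch, count + 1, total + nanos)
            } else {
                Self::pack(epoch, 1, nanos) // stale bin: start a fresh window
            };
            match self
                .0
                .compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed)
            {
                Ok(_) => return,
                Err(actual) => cur = actual,
            }
        }
    }
}
```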
The ChainHeadPtrCache introduced in 7ecdbda can cause connection pool
exhaustion when the cache expires: multiple concurrent callers each
acquire a database connection, then block waiting for a write lock to
update the cache - while still holding their connections.

This adds a HerdCache layer that ensures only one caller queries the
database when the TTL cache expires. Other concurrent callers await
the in-flight query result instead of each acquiring their own
connection.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
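A minimal single-flight sketch of the herd-protection idea, built from the futures crate's `Shared`; this is the shape of the technique, not graph-node's actual HerdCache API:

```rust
use std::sync::Mutex;

use futures::future::{BoxFuture, FutureExt, Shared};

type SharedQuery<T> = Shared<BoxFuture<'static, T>>;

struct SingleFlight<T: Clone> {
    in_flight: Mutex<Option<SharedQuery<T>>>,
}

impl<T: Clone + Send + Sync + 'static> SingleFlight<T> {
    fn new() -> Self {
        Self { in_flight: Mutex::new(None) }
    }

    /// Run `query` at most once per expiry: the first caller becomes the
    /// leader and executes it; concurrent callers await a clone of the same
    /// shared future instead of each taking a database connection.
    async fn run<F>(&self, query: F) -> T
    where
        F: std::future::Future<Output = T> + Send + 'static,
    {
        let (fut, is_leader) = {
            let mut guard = self.in_flight.lock().unwrap();
            match guard.as_ref() {
                Some(f) => (f.clone(), false),
                None => {
                    let f = query.boxed().shared();
                    *guard = Some(f.clone());
                    (f, true)
                }
            }
        }; // guard dropped here, before the await
        let result = fut.await;
        if is_leader {
            // Only the leader clears the slot, so the next expiry starts
            // a fresh query rather than reusing a completed one.
            *self.in_flight.lock().unwrap() = None;
        }
        result
    }
}
```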
…cks for active connections

Connections that were used within the last 30 seconds (configurable via
GRAPH_STORE_CONNECTION_VALIDATION_IDLE_SECS) now skip the SELECT 1 health
check during pool recycle. This reduces connection checkout latency from
~4ms to ~0ms for frequently-used connections while still validating idle
connections to detect stale database connections.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
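A minimal sketch of the skip decision, assuming a last-used timestamp on each pooled connection; only the 30-second default and the env var are from the commit message, the types are illustrative:

```rust
use std::time::{Duration, Instant};

// In the real change this default is overridable via
// GRAPH_STORE_CONNECTION_VALIDATION_IDLE_SECS.
const DEFAULT_VALIDATION_IDLE: Duration = Duration::from_secs(30);

struct PooledConn {
    last_used: Instant,
    // ... the underlying database connection
}

impl PooledConn {
    /// Decide whether this connection needs the SELECT 1 health check on
    /// recycle: connections used within `idle_threshold` are trusted as-is,
    /// so hot connections skip the ~4ms round trip, while long-idle ones
    /// are still validated to catch stale database connections.
    fn needs_validation(&self, idle_threshold: Duration) -> bool {
        self.last_used.elapsed() >= idle_threshold
    }
}
```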